The nature of underlying mental representations is revealed
in errors of the English past tense. Discuss.

Greg Detre

Tuesday, 31 October, 2000

Asma Siddiki

The English past tense provides plenty of interesting evidence for linguists trying to understand how we learn and use language. To form the past tense of a regular verb, we simply add '-ed' to the end of the verb, so that, for example, 'lift' becomes 'lifted'. In the case of irregular verbs, a variety of changes are possible. Sometimes groups of irregular verbs follow patterns, such as 'bring/brought' and 'think/thought', often involving vowel changes, e.g. 'smite/smote/smitten', 'strive/strove/striven', 'take/took/taken'. In a few exceptional cases, two verbs have merged: 'go' keeps its own form in the present tense, but in the past tense it borrows 'went', which was originally the past tense of 'wend'.

'Inflectional morphology' is the study of such changes. Morphology is the area of grammar concerned with morphemes, the most basic units of meaning in language. A morpheme can be a word root, such as 'cran' in 'cranberry', or an entire word like 'bird', 'cup' or 'take'. To this stem we can add affixes: prefixes, suffixes and infixes, so called according to where they are added to the stem (English contains no infixes, which are slotted inside the stem, though Swahili does). Thus a word is composed of one or more morphemes joined together. For instance, the word 'unremovable' can be broken up into its stem, 'remove', the smallest unit of meaning (any smaller and you get syllables, then phonemes), to which we add the suffix '-able' to get 'removable', and then the prefix 'un-'. Morphemes can be either derivational or inflectional. Inflectional morphemes change the grammatical status of the word, such as the suffix '-ed', which indicates the past tense, whereas derivational morphemes, such as 'un-', change the meaning of the word. We are concerned largely with inflectional morphemes, though our use of derivational morphology displays many of the same properties.

The regular/irregular distinction in past tense forms is particularly visible in infant language acquisition studies. Marcus et al.'s studies of over-regularisation (extending a pattern too widely, as when 'ring/rung' is applied to 'bring' to give 'brung') indicate a conflict in the sequence of children's semantic and syntactic development: they initially learn the correct form, then learn the regularisation rule, and then apply it too widely. It appears that infants learn the irregular verbs quite early (probably because the irregular verbs tend to be amongst the most common, which is itself an interesting observation, as we will see below). Then they start to learn the regular verb patterns, and are able to generate the past tense of verbs for themselves (it seems most unlikely that an infant would have heard both the past and present tense of every verb they use). Interestingly, they then over-regularise, applying these regular verb rules to the irregular verbs they had already learnt by rote, for example, 'I broked' or 'I holded'. Finally, in a U-shaped profile of learning, they have to relearn these irregular forms.

There are various accounts of how this might work. Early models focused too heavily on one aspect of a child's learning, arguing either that we learn each new verb individually or that we have a large number of complex rules. However, it now seems much more sensible to see this in terms of a rule which is applied by default, but is inhibited in the case of irregular verbs to allow the rote-learned exceptional form to be used. We will leave the debate between rules and the symbolic approach on the one hand and connectionism on the other aside for now.

Pinker uses nonsense words to illustrate our putatively innate ability to create novel forms by following the generation patterns, or 'rules', inherent in the inflectional morphology of a given language. The classic experiment here, Berko's 'wug test', centred on the invented word 'wug' and the plural that almost all English speakers assumed it to have, 'wugs'. It is important that the invented word is phonetically legitimate in English, otherwise it would seem foreign and we might not make the same assumptions about it that we do about English words we have not heard. Pinker takes as an example of innate linguistic mechanisms this ability to learn the rules tacitly operating in language, such as adding '-ed' to form the past tense of regular verbs.

A further interesting avenue of research focuses on word frequency, measurable in terms of the number of occurrences in a corpus of a certain size[1]. The Brown Corpus contains one million words, taken from a range of sources and intended to be representative of the various uses of the English language in both speech and writing. The number of occurrences of a given word in the corpus (which may or may not include its various forms and senses, depending on the purpose of the experiment) can be taken as an indication of how common or rare it is.
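As a rough illustration of what counting occurrences involves, the following Python sketch tallies verb lemmas in a toy text. The text, the hand-made table of inflected forms and the function name verb_frequencies are all invented for illustration; nothing here is drawn from the Brown Corpus itself, and a real study would need proper tokenisation and lemmatisation.

```python
from collections import Counter
import re

def verb_frequencies(text, lemma_of):
    """Count occurrences of each verb lemma in `text`.

    `lemma_of` maps an inflected form to its lemma (a hypothetical,
    hand-made table); forms missing from the table are ignored.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(lemma_of[t] for t in tokens if t in lemma_of)

# Toy stand-in for a corpus (an assumption, not real corpus data).
toy_corpus = ("She went home. They go out and he goes too. "
              "He smote it. We walked and walk.")
toy_lemmas = {"go": "go", "goes": "go", "went": "go",
              "smote": "smite", "walk": "walk", "walked": "walk"}

counts = verb_frequencies(toy_corpus, toy_lemmas)

# Verbs that occur exactly once, the analogue of the once-only
# verbs discussed in frequency studies.
once_only = sorted(v for v, n in counts.items() if n == 1)
```

Whether 'go', 'goes' and 'went' are counted together or separately is exactly the choice about forms and senses mentioned above; here they are pooled under one lemma.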

The most revealing studies tend to focus on the most common and the fairly rare. For example, of the ten most common verbs in English (e.g. 'be', 'go', 'do', 'make'), almost all share one thing in common: they are irregular. Conversely, of the 877 rare verbs which occur only once in the Brown Corpus (though there will be many extremely rare verbs which may not occur in even a million words of text), all but one ('smite') are regular. This is clearly no coincidence. Pinker argues convincingly that irregular verbs need to be learnt by rote, and so we need to hear them at least once, if not more often. Rare verbs may only crop up once or twice in our early years and maybe not at all in the past tense. As a result, rare irregular verbs do not last long. New generations of speakers who rarely hear the word 'slay', and hear 'slew' more rarely still, will find 'slayed' quite natural[2]. This is the process by which language evolves, and over time, words lose their irregular forms, just as we have lost 'chid' and 'dempt' from Middle English, but gained 'sneak/snuck' and 'catch/caught'.

We can understand why almost all of the rarer verbs are regular: their irregular forms have been forgotten. Indeed, when subjects in an elicited-production experiment are asked to finish the statement 'I agglutinate, but yesterday I ___', they almost always opt for 'agglutinated', despite having almost never come into contact with the word. The explanation of why our most common verbs are so irregular is less clear. Certainly their irregular forms would have survived, but it is not obvious why the common verbs should have become so irregular in the first place. It becomes clearer, however, if we see this creation of new verb forms as the same process that allows children to learn regularisation rules in the first place. In the Arkansas dialect, as in so many others, different rules apply from the arbitrary standard known as Standard English: 'help/holp', 'slid/slood', and elsewhere 'took/tooken', are examples of this creative regularisation gone wrong. Importantly though, we are not simply adding '-ed' here, but rather regularising into families of irregular verbs (just as infants over-regularise from 'ring/rung' to 'bring/brung').
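The idea of regularising into a family of irregular verbs can be sketched in a few lines of Python. This is only a toy, working on spelling rather than sound; the helper names ending_change and past_by_analogy and the tiny 'family' of known verbs are assumptions made for illustration, with the 'ring/rung' pairing taken from the example above.

```python
def ending_change(stem, past):
    """Return the (old ending, new ending) by which `past` differs from `stem`."""
    i = 0
    while i < min(len(stem), len(past)) and stem[i] == past[i]:
        i += 1
    return stem[i:], past[i:]

def past_by_analogy(novel, family):
    """Inflect `novel` by analogy with a family of known irregular verbs.

    If `novel` ends like a known stem, copy that stem's ending change;
    otherwise fall back on the default '-ed' (spelling adjustments such
    as consonant doubling are ignored in this sketch).
    """
    patterns = [ending_change(stem, past) for stem, past in family.items()]
    matches = [(old, new) for old, new in patterns
               if old and novel.endswith(old)]
    if not matches:
        return novel + "ed"
    old, new = max(matches, key=lambda pair: len(pair[0]))  # longest shared ending wins
    return novel[:len(novel) - len(old)] + new

# Hypothetical family of forms the speaker already knows.
family = {"ring": "rung", "sing": "sung"}

brung = past_by_analogy("bring", family)   # analogy with 'ring/rung'
lifted = past_by_analogy("lift", family)   # no family member fits, so '-ed'
```

With this family, 'bring' comes out as 'brung' while 'lift' falls through to 'lifted', which is the contrast the paragraph draws between analogy within a family and the default rule.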

Late learners of English may fail to recognise that sentences such as "Yesterday the girl pet a dog." are grammatically incorrect, since these inflections require us to master quite long-range dependencies that contain redundant information. Language utilises redundancy to make up for imperfect transmission conditions (e.g. a noisy environment) and to reduce the processing burden for the listener.

The dual-route hypothesis points to the transition from rote-learning to symbolic rule-governed behaviour as the source of the over-regularisation error. Having learned the rule, children then have to re-learn the irregular verbs so that they 'block' access to the rule-based route. This blocking is strengthened by increased exposure to the irregular forms, which is why infrequent irregular forms get over-regularised most. The irregular forms would be consolidated in an associative network that runs in parallel with the rule-based network for representing regular verb forms. Double dissociations in language disorders, where some children preserve performance on regular verbs but not irregular ones, and vice versa, lend credence to this dual-route idea.
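The blocking mechanism described in this paragraph can be caricatured in Python as follows. This is a minimal sketch under stated assumptions, not an implementation of any published model: the class name DualRoute, the use of a simple exposure count as 'strength', and the threshold value are all invented for illustration, and the sketch does not attempt to reproduce the early rote-learning stage of the U-shaped curve.

```python
class DualRoute:
    """Toy dual-route speaker: rote memory for irregulars plus a default rule."""

    def __init__(self, threshold=3):
        # Assumed parameter: how many exposures a stored irregular form
        # needs before it is strong enough to block the rule.
        self.threshold = threshold
        self.memory = {}  # stem -> (irregular past form, exposure count)

    def hear(self, stem, past):
        """Register one exposure to an irregular stem/past pair."""
        _, count = self.memory.get(stem, (past, 0))
        self.memory[stem] = (past, count + 1)

    def produce(self, stem):
        """Return the past tense the speaker would produce for `stem`."""
        entry = self.memory.get(stem)
        if entry is not None and entry[1] >= self.threshold:
            return entry[0]      # memory route wins and blocks the rule
        return stem + "ed"       # default rule route

speaker = DualRoute()
for _ in range(5):
    speaker.hear("break", "broke")   # frequent irregular
speaker.hear("hold", "held")         # heard only once

frequent = speaker.produce("break")  # strong enough to block the rule
rare = speaker.produce("hold")       # too weak, so the rule applies
novel = speaker.produce("lift")      # never stored, so the rule applies
```

In this caricature the weakly stored 'hold' surfaces as 'holded', matching the observation that infrequent irregular forms are the ones most often over-regularised.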

Connectionist models have been employed to try to demonstrate such effects without explicitly coding for a system of regular rules and irregular exceptions. One influential early connectionist model was a net trained by Rumelhart and McClelland (1986) to predict the past tense of English verbs.

The net was first trained on a set containing a large number of irregular verbs, and later on a set of 460 verbs containing mostly regulars. The net learned the past tenses of the 460 verbs in about 200 rounds of training, and it generalised fairly well to verbs not in the training set. It even showed a good appreciation of "regularities" to be found among the irregular verbs (send / sent, build / built; blow / blew, fly / flew). During learning, as the system was exposed to the training set containing more regular verbs, it had a promisingly infant-like tendency to over-regularise (e.g. break / broked, instead of break / broke), correctable with more training.

However, the model is poor at generalising to some novel regular verbs, which Pinker & Prince (1988) point to as a failing of connectionist models in general. They believe that neural nets are good at making associations and matching patterns but they have fundamental limitations in mastering such general rules as the formation of the regular past tense. Connectionist modellers have defended themselves against the wider criticism that neural nets are ill-suited to the generalising necessary to master cognitive tasks involving rules (e.g. Niklasson and van Gelder, 1994).

It seems clear that our use and misuse of regular and irregular English past tense forms demonstrates an ability to generalise rules and regularities, and to generate previously unheard past tense forms when necessary. Indeed, this 'creativeness' causes infants to over-generalise when learning new verbs, and even to confuse regular verbs with irregular verbs, to which the rule should not be applied. Connectionist models need to be able to account for these experimental results, which suggest a dual-route system of rules with a list of exceptions, whereas a connectionist model has to represent declarative rules and rote-learned lists in a distributed, non-symbolic way. Nonetheless, early models have learned to differentiate regular and irregular verbs within a restricted training domain and, more encouragingly still, have produced over-generalisation in a child-like way.

[1] An alternative method of quantifying frequency is to count the number of senses of a word, known as 'polysemy'. This is particularly easy to do with computerised dictionaries or relational lexical databases, like WordNet.

[2] Pinker et al. verified this experimentally by testing how 'natural' different regular and irregular verb forms sounded to people. People were often unsure whether they were faced with the correct or the incorrect form of an irregular verb, since the regular form might not seem quite right, yet the irregular form might sound strange too.
